Minimum vertex cover problems on random hypergraphs:
replica symmetric solution and a leaf removal algorithm

We study minimum vertex cover problems on random a-uniform hypergraphs using two
different approaches: the replica method from the statistical mechanics of random systems
and a leaf removal algorithm. We find a phase transition at the critical average degree
e/(a - 1). Below the critical degree, the replica symmetric ansatz of the
statistical-mechanical method holds, and the algorithm yields an estimate of the solution
that coincides with that of the replica method. Above the critical degree, in contrast, the
replica symmetric solution becomes unstable and both methods fail to estimate the exact
solution. These results strongly suggest a close relation between replica symmetry and the
performance of the approximation algorithm.

The more crucial a role computers play in everyday life, the more significant computer
science and information theory become. In particular, computational complexity theory
characterizes the difficulty of computational problems, that is, the limits on improving
the algorithms that solve them. It has revealed that problems belong to several classes
such as P and NP, and that there are many inclusion relations between these classes. For
example, 2-satisfiability problems (2-SAT) belong to the class P, whose members are
guaranteed to be solved in polynomial time. 3-SAT and the vertex cover problem belong to
the class of NP-complete problems. These problems are deeply related to the well-known
P versus NP problem, which continues to trouble theoretical computer scientists, who have
studied the worst-case performance of solving computational problems. Among many types of
combinatorial optimization problems, the minimum vertex cover problem (min-VC) belongs to
the class of NP-hard problems. Approximation algorithms for the min-VC and their
performance have been studied. Applications of the problem include searching for a file in
file storage and improving group testing.

In addition to worst-case analysis, an important alternative is the study of typical-case
behavior on a class of random instances of computational problems. Recently,
statistical-mechanical methods for random spin systems have been applied to problems such
as K-SAT and constraint-satisfaction problems. These methods, developed in spin-glass
theory, enable us to study the typical properties of randomized problems. For example,
statistical-mechanical approaches find a SAT/UNSAT transition in K-SAT, p-XOR-SAT,
q-coloring and min-VC. These results clarify that there is a so-called replica symmetric
(RS) phase, where the replica symmetry ansatz provides correct estimates of the typical
properties, and a replica symmetry breaking (RSB) phase, where those estimates become
unstable. Together with these approaches, the typical-case performance of some
approximation algorithms has also been studied, suggesting that there is a non-trivial
relation between replica symmetry and the performance of approximation algorithms.

In this Letter, we study the minimum vertex cover problem on a random hypergraph. A random
hypergraph is defined by two distributions, the degree distribution and the edge size
distribution. The degree is the number of edges connected to a vertex, and the edge size is
the number of vertices connected to an edge. For the former, the Poisson distribution and
the delta function are often used; the resulting graphs are called an Erdős-Rényi random
graph and a regular random graph, respectively. For the latter, one uses the delta function
with mean a, which yields a random graph in which every edge has the same size a, called a
random a-uniform hypergraph. In general, a statistical-mechanical model defined on a
hypergraph has multi-body interactions determined by its edge size. In contrast to a
conventional two-body interaction, higher-order multi-body interactions often change the
type of phase transition and the breaking pattern of the replica symmetry, as shown in the
p-body spin glass model [17]. From this viewpoint, the influence of the edge size on the
typical estimates of random computational problems has been investigated by
statistical-mechanical approaches. In fact, it has been revealed that the edge size changes
the properties of some problems such as K-SAT, q-coloring [18] and min-VCs on K-uniform
regular random hypergraphs. It is also found that there exists a P/NP transition between
2-SAT and 3-SAT. Here we study the typical-case size of the min-VC, explained later, on
random a-uniform hypergraphs and focus on the relation between replica symmetry and the
performance of an approximation algorithm called a leaf removal algorithm.

Let us suppose that an a-uniform hypergraph $G = (V, E)$ consists of N vertices
$i \in V = \{1, \dots, N\}$ and (hyper)edges $(i_1, \dots, i_a) \in E \subset V^a$
$(i_1 < \dots < i_a)$. We define covered vertices as a subset $V' \subset V$ and covered
edges as the subset of edges connected to at least one covered vertex. The vertex cover
problem on the hypergraph G is to find a set of covered vertices $V'$ by which all edges are
covered. We define the cover ratio on G as $|V'|/N$, with $|V'|$ being the size of the
vertex cover. The min-VC on G is to find a set of covered vertices with the minimum cover
ratio. In the random a-uniform hypergraph, each edge is chosen independently from all
a-tuples of vertices with probability p. The degree distribution of the graph converges to
the Poisson distribution with average degree c, which is given by $c = pN^{a-1}/(a-1)!$ for
large N. In this Letter, we focus on the average of the minimum cover ratio, $x_c$, over
sparse random hypergraphs whose average degree c is O(1).
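
To make these definitions concrete, the following minimal sketch checks whether a vertex
set covers every edge and finds the minimum cover ratio of a toy hypergraph by exhaustive
search. The function names `is_vertex_cover` and `min_cover_ratio` are illustrative and do
not come from the text, and the search is exponential in N, so it is only meant for very
small instances.

```python
from itertools import combinations


def is_vertex_cover(cover, edges):
    """Return True if every edge contains at least one covered vertex."""
    cover = set(cover)
    return all(any(v in cover for v in edge) for edge in edges)


def min_cover_ratio(num_vertices, edges):
    """Exhaustively search for a smallest vertex cover; return (ratio, cover)."""
    # Try candidate covers in order of increasing size, so the first hit is minimal.
    for size in range(num_vertices + 1):
        for candidate in combinations(range(num_vertices), size):
            if is_vertex_cover(candidate, edges):
                return size / num_vertices, set(candidate)


# Toy 3-uniform hypergraph on 5 vertices: covering vertex 2 alone covers both edges.
toy_edges = [(0, 1, 2), (2, 3, 4)]
ratio, cover = min_cover_ratio(5, toy_edges)
print(ratio, cover)
```

Here the cover ratio is $|V'|/N = 1/5$, since the single vertex shared by the two edges
suffices.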

The vertex cover problems are mapped onto the lattice gas model on the random hypergraphs.
We define a variable $\nu_i$ on each vertex, representing the existence of a gas particle,
which takes 0 if vertex i is covered and 1 if it is uncovered. A covered edge has at least
one vertex with $\nu_i = 0$ among the vertices it connects. Thus, an indicator function of
a given particle configuration $\nu = \{\nu_i\} \in \{0, 1\}^N$ is defined, which takes 1
if $\nu$ is a solution of the vertex cover problem on the hypergraph, and 0 otherwise.
Using the indicator function, the grand canonical partition function of the model is
written as a sum over all configurations $\nu$, with $\mu$ a chemical potential. In this
formulation, only the solutions of the vertex cover problem contribute to the partition
function, and its ground states in the large-$\mu$ limit are given by the solutions of the
min-VC. To study the typical case of min-VCs, we need to take the average over the random
hypergraphs and the limit $N \to \infty$. Then, the average minimum-cover ratio $x_c(c)$ is
expressed in terms of the grand canonical average $\langle \cdots \rangle_\mu$ and the
average E over the random hypergraph ensemble. Our aim is to obtain the theoretical
estimate of the average minimum-cover ratio as a function of the average degree c.
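
The lattice-gas formulation described above can be written out explicitly. The following
is a sketch of one consistent choice, assuming the standard hard-constraint lattice-gas
form with $\nu_i = 1$ for an uncovered vertex. The symbols $\chi$ and $\Xi$ and the exact
normalization are assumptions made for illustration, not expressions quoted from the text.

```latex
\chi(\nu) = \prod_{(i_1,\dots,i_a)\in E} \left(1 - \nu_{i_1}\nu_{i_2}\cdots\nu_{i_a}\right),
\qquad
\Xi = \sum_{\nu} \exp\!\Big(\mu \sum_{i=1}^{N} \nu_i\Big)\,\chi(\nu),
\qquad
x_c(c) = 1 - \lim_{\mu\to\infty}\lim_{N\to\infty}
         E\,\Big\langle \frac{1}{N}\sum_{i=1}^{N}\nu_i \Big\rangle_{\mu}.
```

With this choice, each factor of $\chi$ vanishes exactly when all a vertices of an edge
are uncovered, and a large positive $\mu$ favors configurations with as many uncovered
vertices as the constraints allow, that is, minimum vertex covers.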

The average minimum-cover ratio is derived from the averaged grand potential density
$-(\mu N)^{-1} E \ln \Xi$, with $\Xi$ the grand canonical partition function, which is
obtained by using the replica method for finite-connectivity graphs [22]. Following the
standard procedure of the replica method, the original problem is reduced to solving a
saddle-point equation for a replicated order-parameter functional. To proceed with the
calculation, we assume the RS ansatz, namely that the solution of the saddle-point equation
is replica symmetric. Introducing a local field on each vertex associated with the order
parameter, together with its distribution function, we obtain the saddle-point equation for
the distribution. Finally, under the RS ansatz, the average minimum-cover ratio is obtained
as a function of the average degree c,

$$x_c(c) = 1 - \left[\frac{W((a-1)c)}{(a-1)c}\right]^{\frac{1}{a-1}}
\left[1 + \frac{W((a-1)c)}{a}\right], \qquad (4)$$

where W(x) is the Lambert W function defined by $W(x)\exp(W(x)) = x$. We call this estimate
the RS solution of min-VCs. This solution is also obtained by an alternative cavity method
[12]. Although an instability of the RS solution such as the de Almeida-Thouless
instability [23] must be examined to validate the solution, here we simply study an
instability condition of the saddle-point equation against a perturbation of the local
field distribution within the RS sector. The analysis leads to a critical value of the
average degree, $c^* = e/(a-1)$, above which the RS solution becomes unstable. These
results, $x_c$ and $c^*$, include the case of a = 2 [10]. The obtained $x_c$ gives a correct
value below the critical average degree, while an RSB solution for $x_c$ is required above
it.
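
The RS solution is easy to evaluate numerically. The sketch below assumes that Eq. (4)
reads $x_c = 1 - [W((a-1)c)/((a-1)c)]^{1/(a-1)}\,[1 + W((a-1)c)/a]$, a form that reduces to
$1 - (2W(c) + W(c)^2)/(2c)$ for a = 2. The helper names `lambert_w`, `rs_cover_ratio` and
`critical_degree` are illustrative, and the Lambert W function is computed with a plain
Newton iteration so that the example needs only the standard library.

```python
import math


def lambert_w(x, tol=1e-14, max_iter=100):
    """Principal branch W(x) for x >= 0, via Newton iteration on w * exp(w) = x."""
    w = math.log1p(x)  # starting guess; it satisfies w * exp(w) >= x for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol * (1.0 + abs(w)):
            break
    return w


def rs_cover_ratio(c, a):
    """RS estimate of the average minimum-cover ratio (assumed form of Eq. (4))."""
    if c == 0:
        return 0.0  # no edges, so nothing has to be covered
    w = lambert_w((a - 1) * c)
    return 1.0 - (w / ((a - 1) * c)) ** (1.0 / (a - 1)) * (1.0 + w / a)


def critical_degree(a):
    """Critical average degree c* = e / (a - 1) of the RS instability."""
    return math.e / (a - 1)


for a in (2, 3, 4):
    c_star = critical_degree(a)
    print(a, c_star, rs_cover_ratio(c_star, a))
```

At $c = c^*$ the argument of W equals e, so $W = 1$ there; this is a convenient check of
the implementation.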

Here we turn our attention to the estimate of $x_c$ by an approximation algorithm. The leaf
removal algorithm has been proposed as an approximation algorithm to solve a min-VC on a
graph with a = 2, and has also been applied to search for a k-core [25] and a 3-XOR-SAT
solution [15]. For a min-VC on a given graph, this algorithm consists of iterative steps in
which vertices forming a leaf, as well as the edges connected to the leaves, are removed
from the graph, with covered vertices appropriately assigned among the removed vertices.
Each removal step creates new leaves, and the algorithm continues iteratively until no leaf
remains. By this procedure, the minimum cover ratio is estimated correctly at least for the
removed part of the graph. We consider the global leaf removal (GLR) algorithm [3], which
simultaneously removes all the leaves found in each recursive step. We focus on the
extension of this algorithm to the min-VC on a hypergraph with a = 3, while it is
straightforward to extend it to a hypergraph with a ≥ 4. A crucial point in our algorithm
is the definition of a leaf: a leaf $\{i, j, k\} \in V^3$ $(i < j < k)$ is defined as a
3-tuple of vertices connected by an edge (i, j, k), at least two of which have degree one.
The GLR algorithm is defined as follows:

Step 1: The initial graph G is named $G^{(0)}$. Set k = 0.

Step 2: Search for all leaves in the graph $G^{(k)}$. If there is no leaf, go to Step 6.

Step 3: Remove all the leaves, except for the vertices which belong to more than two
leaves, named a bunch of leaves; remove only one of the leaves in each bunch.

Step 4: Assign a covered vertex to the vertex with the maximal degree in each leaf
removed from $G^{(k)}$.

Step 5: The remaining graph is named $G^{(k+1)}$. Return to Step 2 with k increased by one.

Step 6: If there exist connected vertices in the remaining graph, assign all of them
to covered vertices. Stop the algorithm.
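
The steps above can be sketched in code. The version below is a simplified sequential
variant, not the global procedure itself: it removes one leaf at a time instead of all
leaves of a step simultaneously. The function name `leaf_removal` and the returned tuple
are illustrative choices, and the leaf test is written for general edge size (all but at
most one vertex of the edge have degree one), which matches the definition given for a = 3.

```python
from collections import defaultdict


def leaf_removal(num_vertices, edges):
    """Sequential leaf removal on a uniform hypergraph.

    Returns (covered, removed, core, isolated) as sets of vertex labels.
    """
    edges = [tuple(edge) for edge in edges]
    alive = [True] * len(edges)
    incident = defaultdict(set)  # vertex -> indices of its remaining edges
    for idx, edge in enumerate(edges):
        for v in edge:
            incident[v].add(idx)

    def degree(v):
        return len(incident[v])

    covered, removed = set(), set()
    found_leaf = True
    while found_leaf:
        found_leaf = False
        for idx, edge in enumerate(edges):
            if not alive[idx]:
                continue
            # Leaf: all but at most one vertex of the edge have degree one.
            if sum(degree(v) == 1 for v in edge) >= len(edge) - 1:
                hub = max(edge, key=degree)  # the maximal-degree vertex is covered
                covered.add(hub)
                removed.update(edge)
                # Every edge touching the covered vertex is now covered: delete it.
                for j in list(incident[hub]):
                    alive[j] = False
                    for v in edges[j]:
                        incident[v].discard(j)
                found_leaf = True

    # Vertices that still have edges form the core; the rest are removed or isolated.
    core = {v for v in range(num_vertices) if degree(v) > 0}
    isolated = set(range(num_vertices)) - core - removed
    return covered, removed, core, isolated


covered, removed, core, isolated = leaf_removal(5, [(0, 1, 2), (2, 3, 4)])
# Cover-ratio estimate: covered vertices of the removed part plus the whole core (Step 6).
print((len(covered) + len(core)) / 5)
```

In the toy example, vertex 2 is covered, vertices 3 and 4 become isolated, and no core
remains. On a hypergraph without any leaf, such as the four triples on four vertices, the
whole graph is returned as the core.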

It is proven that the result of the algorithm is independent of the order of removal and of
the selection of a leaf out of a bunch of leaves in the removal process. When the recursive
steps stop, the remaining graph consists of isolated vertices and a core, which is defined
as a set of vertices connected to edges without leaves. Vertices in a bunch of leaves which
are not selected for removal in Step 3 become isolated, and a core of order O(N) exists for
large c. We note that Step 4 can be omitted if one is interested only in the minimum cover
ratio, not in the covered vertices. Because the algorithm covers all vertices in the core
without searching for the solution of the min-VC there, as shown in Step 6, the existence
of a core of order O(N) leads to an overestimation of the average minimum-cover ratio. We
study the core size at the end of the GLR algorithm by numerically performing the above
procedure for finite-size random hypergraphs with a = 3. While the computational time of
the GLR algorithm is proportional to the number of vertices, generating a random graph
takes time of order $O(N^3)$. To avoid this cost, we use the microcanonical ensemble [14],
fixing the number of edges to its expected value cN/3 and ignoring fluctuations of the
average degree. We expect that such fluctuations are irrelevant in the large-N limit. In
Fig. 1, the core size density obtained by numerical simulations is presented as a function
of the average degree c for sizes up to $N = 10^5$. The data averaged over $10^4$ random
graphs converge well for large sizes, and a giant core of order O(N) emerges above a
certain value of c.
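
A minimal sketch of the microcanonical sampling described above: instead of testing all
$O(N^3)$ triples with probability p, a fixed number of distinct edges is drawn uniformly at
random. The function name `random_uniform_hypergraph`, the rounding of cN/a to an integer,
and the rejection of repeated edges are assumptions made for illustration.

```python
import random


def random_uniform_hypergraph(num_vertices, c, a=3, seed=None):
    """Draw round(c * N / a) distinct a-vertex edges uniformly at random."""
    rng = random.Random(seed)
    num_edges = round(c * num_vertices / a)
    edges = set()
    # Rejection of duplicates is cheap in the sparse regime, where num_edges is far
    # below the number of possible a-subsets; it is not suited to dense graphs.
    while len(edges) < num_edges:
        edges.add(tuple(sorted(rng.sample(range(num_vertices), a))))
    return sorted(edges)


edges = random_uniform_hypergraph(300, 1.2, a=3, seed=0)
print(len(edges), 3 * len(edges) / 300)  # number of edges and realized average degree
```

The cost is proportional to the number of edges, so generating the graph no longer
dominates the linear-time leaf removal.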

FIG. 1. (Color online) The core size density in the GLR algorithm as a function of the
average degree c. Open marks are the data obtained by the GLR algorithm with $10^4$,
$5 \times 10^4$, and $10^5$ vertices, averaged over 10 random hypergraphs. The solid line
is the core size density predicted by our recursive analysis. The vertical dotted line
represents the critical average degree $c^* = e/2$.

We discuss the asymptotic behavior of the recursive procedure in the GLR algorithm. We
introduce the average fraction of the core, $c_n$, and that of the isolated vertices,
$i_n$, over random hypergraphs after the n-th step of the algorithm, and find

$$i_n = e_{2n+1} + 2e_{2n} + 2c\,e_{2n}e_{2n-1}^2 - 2,$$

$$c_n = e_{2n} - e_{2n+1} - 2c\,e_{2n}e_{2n-1}^2 + 2c\,e_{2n-1}^3,$$

where the parameter $e_n$ obeys the recursion relation $e_n = \exp(-c\,e_{n-1}^2)$ with the
initial condition $e_{-1} = 0$. A detailed derivation of the formulas will be reported in a
separate paper. By definition, the average fraction of the removed vertices $r_n$ up to the
n-th step is given by $r_n = 1 - i_n - c_n$. These fractions are governed by the sequence
$\{e_n\}$, and their values at the end of the algorithm are determined by the asymptotic
behavior of the recursion relation. It is found that there exists a critical average degree
$c^* = e/2$ for the recursion relation. Below the critical value, the sequence $\{e_n\}$
converges to the unique value $[W(2c)/(2c)]^{1/2}$, and consequently the core size
$c_\infty$ is zero. Above the critical value, however, a bifurcation occurs in the
recursion relation and the sequence has a cycle of period two. This type of transition
would also occur for a > 3, at the critical average degree $c^* = e/(a-1)$. Because
$e_{-1} = 0$, an even term $e_{2n}$ is larger than the term one step later, $e_{2n+1}$. We
compute the limiting values $\lim_{n\to\infty} e_{2n+1}$ and $\lim_{n\to\infty} e_{2n}$
numerically as functions of c. The difference between them yields the emergence of a core
of order O(N). We present the core size density obtained from the asymptotic analysis of
the recursion relation by the solid line in Fig. 1, which coincides with the data from
numerical simulations. Thus, we confirm that a core percolation occurs in the GLR algorithm
at the critical average degree, which coincides with that of the RS instability. From the
analysis near the critical degree, it is found that the size of the core grows linearly
just above the critical average degree. These findings, the bifurcation in the recursion
relation and the core percolation, are shared with the min-VCs on random graphs with a = 2.
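
The recursive analysis can be explored numerically. The sketch below assumes, for a = 3,
the recursion $e_n = \exp(-c\,e_{n-1}^2)$ with $e_{-1} = 0$ and the expressions
$i_n = e_{2n+1} + 2e_{2n} + 2c\,e_{2n}e_{2n-1}^2 - 2$ and
$c_n = e_{2n} - e_{2n+1} - 2c\,e_{2n}e_{2n-1}^2 + 2c\,e_{2n-1}^3$; the quadratic exponent
in the recursion is the one consistent with the fixed point $[W(2c)/(2c)]^{1/2}$. The
function names and the fixed number of iterations are illustrative choices.

```python
import math


def glr_fractions(c, n_steps=2000):
    """Return (isolated, core, removed) fractions after n_steps steps, for a = 3."""
    e_prev = 0.0                         # e_{-1}
    e_even = math.exp(-c * e_prev ** 2)  # e_0
    e_odd = math.exp(-c * e_even ** 2)   # e_1
    for _ in range(n_steps):
        e_prev = e_odd                       # e_{2n-1}
        e_even = math.exp(-c * e_prev ** 2)  # e_{2n}
        e_odd = math.exp(-c * e_even ** 2)   # e_{2n+1}
    isolated = e_odd + 2 * e_even + 2 * c * e_even * e_prev ** 2 - 2
    core = e_even - e_odd - 2 * c * e_even * e_prev ** 2 + 2 * c * e_prev ** 3
    removed = 1 - isolated - core
    return isolated, core, removed


def glr_cover_estimate(c, n_steps=2000):
    """Cover-ratio estimate: one third of the removed part plus the whole core."""
    _, core, removed = glr_fractions(c, n_steps)
    return removed / 3 + core


c_star = math.e / 2
for c in (1.0, c_star - 0.1, c_star + 0.1, 2.0):
    isolated, core, removed = glr_fractions(c)
    print(f"c = {c:.3f}  core = {core:.6f}  estimate = {glr_cover_estimate(c):.6f}")
```

Before any removal (`n_steps=0`) the expressions give an isolated fraction $e^{-c}$ and a
core fraction $1 - e^{-c}$, as expected for a Poisson degree distribution. Convergence
slows down close to $c^*$, so more iterations are needed there.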

As mentioned above, the GLR algorithm estimates the minimum cover ratio from the size of
the part of the graph removed during the recursive procedure, which is given by
$r_\infty = 1 - i_\infty - c_\infty$. Taking one-third of $r_\infty$ and adding $c_\infty$
to the value, we obtain the estimate of the average minimum-cover ratio by the algorithm.
Thus, we find that below the critical average degree e/2 the estimate $r_\infty/3$
coincides with the RS solution, Eq. (4), estimated by the replica method. In contrast, the
sequence $\{e_n\}$ of the algorithm does not converge to a unique value above the critical
value, and the GLR algorithm cannot give a precise estimate of $x_c$ there.
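
The coincidence below the critical degree can be checked in a few lines. This is a sketch
for a = 3 that assumes the fixed point $e^* = \exp(-c\,e^{*2})$ of the recursion, the
expression $i_n = e_{2n+1} + 2e_{2n} + 2c\,e_{2n}e_{2n-1}^2 - 2$ for the isolated fraction,
and the form $x_c = 1 - [W(2c)/(2c)]^{1/2}\,[1 + W(2c)/3]$ of Eq. (4); these readings are
assumptions rather than statements quoted from the text.

```latex
\begin{aligned}
2c\,e^{*2}\exp\!\left(2c\,e^{*2}\right) &= 2c
  \;\Longrightarrow\; W(2c) = 2c\,e^{*2},
  \qquad e^* = \left[\frac{W(2c)}{2c}\right]^{1/2}, \\
c_\infty &= 0, \qquad i_\infty = 3e^* + 2c\,e^{*3} - 2, \\
\frac{r_\infty}{3} &= \frac{1 - i_\infty}{3}
  = 1 - e^* - \frac{2c\,e^{*3}}{3}
  = 1 - e^*\left[1 + \frac{W(2c)}{3}\right].
\end{aligned}
```

The last expression is the RS form with a = 3, so the two estimates agree whenever the
sequence converges to a single fixed point.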

In order to confirm whether these analyses estimate the average minimum-cover ratio $x_c$
correctly, we also evaluate the min-VCs by the Markov chain Monte Carlo (MC) method. We use
the replica exchange Monte Carlo method (EMC) [27] to accelerate the dynamics of the
system, with 50 replicas in the range of the chemical potential from -2 to 10. In our Monte
Carlo simulations, the smallest cover ratio found in typically $2^{17}$ Monte Carlo steps
is used as the estimate of $x_c$ for each random graph, and it is averaged over 800
randomly generated hypergraphs. The number of vertices of the graph is up to N = 512. The
average minimum-cover ratio is extrapolated from these numerical results for finite N.
Fig. 2 shows the obtained minimum cover ratio as a function of the average degree c. Below
the critical average degree e/2, where the RS solution is considered to be correct, we
observe that the MC result is consistent with those of the two approaches, the replica
method and the GLR algorithm. Above the critical value, on the other hand, the MC estimate
stays slightly above that of the replica method and deviates considerably from that of the
GLR algorithm. The former is due to the instability of the RS solution and the latter to
the existence of a core of order O(N).
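
The simulation strategy can be illustrated with a bare-bones replica exchange Monte Carlo
sketch for the hard-constraint lattice gas. It is not the implementation used for the
results above: the single-vertex Metropolis moves, the neighbor-exchange schedule, the
function name `exchange_mc_min_cover`, and the toy parameters (8 replicas, 500 sweeps) are
all assumptions made for illustration, with each configuration weighted by
$\exp(\mu \sum_i \nu_i)$ and $\nu_i = 1$ denoting an uncovered vertex.

```python
import math
import random


def exchange_mc_min_cover(num_vertices, edges, mus, sweeps, seed=0):
    """Replica exchange MC over chemical potentials mus; returns the best cover found."""
    rng = random.Random(seed)
    incident = [[] for _ in range(num_vertices)]
    for edge in edges:
        for v in edge:
            incident[v].append(edge)

    # nu[v] = 1: uncovered (particle present), nu[v] = 0: covered.
    # The all-covered configuration always satisfies every edge constraint.
    replicas = [[0] * num_vertices for _ in mus]
    best_cover = set(range(num_vertices))

    for _sweep in range(sweeps):
        for nu, mu in zip(replicas, mus):
            for _trial in range(num_vertices):
                v = rng.randrange(num_vertices)
                if nu[v] == 1:
                    # Covering v never breaks a constraint; weight ratio is exp(-mu).
                    if rng.random() < math.exp(min(0.0, -mu)):
                        nu[v] = 0
                else:
                    # Uncovering v is allowed only if each of its edges keeps
                    # another covered vertex; weight ratio is exp(mu).
                    allowed = all(any(nu[u] == 0 for u in edge if u != v)
                                  for edge in incident[v])
                    if allowed and rng.random() < math.exp(min(0.0, mu)):
                        nu[v] = 1
        # Exchange neighboring replicas with the Metropolis rule for these weights.
        for k in range(len(mus) - 1):
            n_lo, n_hi = sum(replicas[k]), sum(replicas[k + 1])
            log_ratio = (mus[k] - mus[k + 1]) * (n_hi - n_lo)
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                replicas[k], replicas[k + 1] = replicas[k + 1], replicas[k]
        # Keep the smallest cover seen in any replica so far.
        for nu in replicas:
            if num_vertices - sum(nu) < len(best_cover):
                best_cover = {v for v in range(num_vertices) if nu[v] == 0}
    return best_cover


toy_edges = [(0, 1, 2), (2, 3, 4)]
toy_mus = [-2 + 12 * k / 7 for k in range(8)]  # 8 replicas between -2 and 10
toy_cover = exchange_mc_min_cover(5, toy_edges, toy_mus, sweeps=500)
print(len(toy_cover) / 5)
```

Replicas at small or negative $\mu$ move freely between valid configurations, while those
at large $\mu$ stay close to small covers; the exchanges let the latter escape local
optima.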

To summarize, we have considered the minimum vertex cover problems on random a-uniform
hypergraphs and analyzed them by the statistical-mechanical method and the approximation
algorithm. The replica method estimates the average minimum-cover ratio $x_c$ as a function
of the average degree c under the replica symmetric assumption. We find that there is an
RS/RSB phase transition at the critical average degree $c^* = e/(a-1)$, which is well above
the percolation threshold c = 1/(a-1) of the random graph. We also perform the global leaf
removal algorithm and study the asymptotic behavior of the recursive procedure of the
algorithm, particularly in the case of a = 3. If the average degree is below the critical
value, which coincides with that in the replica theory, there is a core of order O(1) in
the remaining part of the graph, which does not affect the estimate of the minimum cover
ratio. In contrast, above the critical value, a core of order O(N) emerges, leading to a
wrong estimate of the minimum cover ratio. Comparing with the results obtained by MC
simulations, we confirm that these estimates are correct below the critical average degree,
but not above it. These results strongly suggest that there is a close relation between the
replica symmetry in statistical physics and the performance of the leaf removal algorithm
even when the edge size a is larger than two.

FIG. 2. (Color online) The average minimum-cover ratio on random a-uniform hypergraphs
with a = 3 as a function of the average degree c. Open marks are numerical results by the
exchange MC (diamonds) and by the GLR algorithm for $N = 10^4$ (squares) and $10^5$
(triangles). Lines represent analytical results by the replica method (solid), by the GLR
algorithm (dashed) and on the removed part of the graphs by the GLR algorithm
(dashed-dotted). The vertical dotted line is the critical average degree $c^* = e/2$, below
which all lines merge into a single line.

It is noted that this relation does not hold for all types of random graphs. For instance,
the GLR algorithm removes no vertex on regular random graphs with c ≥ 2 because no leaf is
found there while, from the statistical-mechanical point of view, the min-VC on regular
random 2-uniform graphs with degree 2 is described by the RS solution. Thus, the relation
depends on the type of random graph and on the approximation algorithm. In addition to the
leaf removal algorithm, a recent work on the min-VC problem with a = 2 suggests that linear
programming algorithms, which are among the most commonly used tools for solving
optimization problems, exhibit the relation discussed in the present work. Further study is
needed to establish the relation between replica symmetry and the performance of various
algorithms.

This research was supported by Grants-in-Aid for Scientific Research from the MEXT,
Japan, No.